We describe some applications of quantum information theory to the
analysis of quantum limits on measurement sensitivity. A measurement
of a weak force acting on a quantum system is a determination of a
classical parameter appearing in the master equation that governs the
evolution of the system; limitations on measurement accuracy arise
because it is not possible to distinguish perfectly among the
different possible values of this parameter. Tools developed in the
study of quantum information and computation can be exploited to
improve the precision of physics experiments; examples include
superdense coding, fast database search, and the quantum Fourier
transform.



I. INTRODUCTION: DISTINGUISHABILITY OF 
SUPEROPERATORS 

The exciting recent developments in the theory of quantum information
and computation have already established an enduring legacy. The two
most far-reaching results, that a quantum computer (apparently) can
solve problems that will forever be beyond the reach of classical
computers, and that quantum information can be protected from errors
if properly encoded, have surely earned a prominent place at the
foundations of computer science.

The implications of these ideas for the future of physics 
are less clear, but we expect them to be profound. In par- 
ticular, we anticipate that our deepening understanding 
of quantum information will lead to new strategies for 
pushing back the boundaries of quantum-limited mea- 
surements. Quantum entanglement, quantum error cor- 
rection, and quantum information processing can all be 
exploited to improve the information-gathering capabil- 
ity of physics experiments. 

In a typical quantum-limited measurement, a classical signal is
conveyed over a quantum channel. Nature sends us a message, such as
the value of a weak force, that can be regarded as a classical
parameter appearing in the Hamiltonian of the apparatus (or more
properly, if there is noise, its master equation). The apparatus
undergoes a quantum operation $\$(a)$, and we are to extract as much
information as we can about the parameter(s) $a$ by choosing an
initial preparation of the apparatus, and a positive-operator-valued
measure (POVM) to read it out. Quantum information theory should be
able to provide a theory of the distinguishability of superoperators,
a measure of how much information we can extract that distinguishes
one superoperator from another, given some specified resources that
are available for the purpose. This distinguishability measure would
characterize the inviolable limits on measurement precision that can
be achieved with fixed resources.

*CALT-68-2217
†amchilds@caltech.edu
‡preskill@theory.caltech.edu
§renes@its.caltech.edu

Many applications of quantum information theory involve the problem
of distinguishing nonorthogonal quantum states. For example, a
density operator $\rho_a$ is chosen at random from an ensemble
$\mathcal{E} = \{\rho_a, p_a\}$ (where $p_a$ is an a priori
probability), and a measurement is performed to extract information
about which $\rho_a$ was chosen. The problem of distinguishing
superoperators is rather different, but the two problems are related.
For example, let us at first ignore noise, and also suppose that the
classical force we are trying to detect is static. Then we are trying
to identify a particular time-independent Hamiltonian $H_a$ that has
been drawn from an ensemble $\{H_a, p_a\}$. We may choose a particular
initial pure state $|\psi_0\rangle$, and then allow the state to
evolve, as governed by the unknown Hamiltonian, for a time $t$; our
ensemble of possible Hamiltonians generates an ensemble of pure
states

$$ \{ |\psi_a(t)\rangle = e^{-itH_a}|\psi_0\rangle ,\ p_a \} . \qquad (1) $$

Since our goal is to gain as much information as possible about the
applied Hamiltonian, we should choose the initial state
$|\psi_0\rangle$ so that the resulting final states are maximally
distinguishable.

There are many variations on the problem, distinguished in part by
the resources we regard as most valuable. We might have the freedom
to choose the elapsed time as we please, or we might impose
constraints on $t$. We might have the freedom to modify the
Hamiltonian by adding an additional "driving" term that is under our
control. We might use an adaptive strategy, where we make repeated
(possibly weak) measurements, and our choice of initial state or
driving term in later measurements takes into account the information
already collected in earlier measurements.

Imposing an appropriate cost function on resources is an important
aspect of the formulation of the problem, particularly in the case of
the detection of a static (DC) signal. For example, we could in
principle repeat the measurement procedure many times to continually
improve the accuracy of our estimate. In this respect, the problem of
distinguishing superoperators does not have quite so fundamental a
character as the problem of distinguishing states, as in the latter
case the no-cloning principle prevents us from making repeated
measurements on multiple copies of the unknown state. But for a
time-dependent signal that stays "on" for a finite duration, there
will be a well-defined notion of the optimal strategy for
distinguishing one possible signal from another, once our apparatus
and its coupling to the classical signal have been specified. Still,
for the sake of simplicity, we will mostly confine our attention here
to the case of DC signals.

We don't know exactly what shape this nascent theory of the
distinguishability of superoperators should take, but we hope that
further research can promote the development of new strategies for
performing high-precision measurements. On the one hand we envision a
program of research that will be relevant to real laboratory
situations. On the other hand, we seek results that are to some
degree robust and general (not tied to some particular model of
decoherence, or to a particular type of coupling between quantum
probe and classical signal). Naturally, there is some tension between
these two central desiderata; rather than focus on a specific
experimental context, we lean here toward more abstract formulations
of the problem.

Our discussion is far from definitive; its goal is to invite 
a broader community to consider these issues. We will 
mostly be content to observe that some familiar concepts 
from the theory of quantum information and computa- 
tion can be translated into tools for the measurement 
of classical forces. Some examples include superdense 
coding, fast database search, and the quantum Fourier 
transform. 

Naturally, the connections between quantum information theory and
precision measurement have been recognized previously by many
authors. Especially relevant is the work by Wootters, by Braunstein,
and by Braunstein and Caves on state distinguishability and parameter
estimation, and by Braginsky and others on quantum nondemolition
measurement. Though what we have to add may be relatively modest, we
hope that it may lead to further progress.



II. SUPERDENSE CODING: IMPROVED 
DISTINGUISHABILITY THROUGH 
ENTANGLEMENT 

Recurring themes of quantum information theory are that entanglement
can be a valuable resource, and that entangled measurements sometimes
can collect more information than unentangled measurements. It should
not be surprising, then, if the experimental physicist finds that the
best strategies for detecting a weak classical signal involve the
preparation of entangled states and the measurement of entangled
observables.

Suppose, for example, that our apparatus is a single qubit, whose
time-independent Hamiltonian (aside from an irrelevant additive
constant) can be expressed as

$$ H_a = \vec a \cdot \vec\sigma ; \qquad (2) $$

here $\vec a = (a_1, a_2, a_3)$ is an unknown three-vector, and
$\sigma_{1,2,3}$ are the Pauli matrices. (We may imagine that a
spin-1/2 particle with a magnetic moment is employed to measure a
static magnetic field.) By preparing an initial state of the qubit,
allowing the qubit to evolve, and then performing a single
measurement, we can extract at best one bit of information about the
magnetic field (as Holevo's theorem ensures that the optimal POVM in
a two-dimensional Hilbert space can acquire at most one bit of
information about a quantum state).

If we have two qubits, and measure them one at a time, we can collect
at best two bits of information about the magnetic field. In
principle, this could be enough to distinguish perfectly among four
possible values of the field. In practice, for a generic choice of
four Hamiltonians labeled by vectors $\vec a^{(1,2,3,4)}$, the
optimal information gain cannot be achieved by measuring the qubits
one at a time. Rather, a better strategy exploits quantum
entanglement.

An improved strategy can be formulated by following the paradigm of
superdense coding, whereby shared entanglement is exploited to
enhance classical communication between two parties. To implement
superdense coding, the sender (Alice) and the receiver (Bob) use a
shared Bell state

$$ |\phi^+\rangle = \frac{1}{\sqrt2}(|00\rangle + |11\rangle) \qquad (3) $$

that they have prepared previously. Alice applies one of the four
unitary operators $\{I, \sigma_1, \sigma_2, \sigma_3\}$ to her member
of the entangled pair, and then sends it to Bob. Upon receipt, Bob
possesses one of the four mutually orthogonal Bell states

$$ |\phi^\pm\rangle = \frac{1}{\sqrt2}(|00\rangle \pm |11\rangle) , \quad
   |\psi^\pm\rangle = \frac{1}{\sqrt2}(|01\rangle \pm |10\rangle) ; \qquad (4) $$

by performing an entangled Bell measurement (simultaneous
measurements of the commuting collective observables
$\sigma_1 \otimes \sigma_1$ and $\sigma_3 \otimes \sigma_3$), Bob can
perfectly distinguish the states. Although only one qubit passes from
Alice to Bob, two classical bits of information are transmitted and
successfully decoded. In fact, this enhancement of the transmission
rate is optimal: with shared entanglement, no more than two classical
bits can be carried by each transmitted qubit.

The lesson of superdense coding is that entanglement can allow us to
better distinguish operations on quantum states, and we may apply
this method to the problem of distinguishing Hamiltonians.^1 Let us
imagine that the magnitude of the magnetic field is known, but not
its direction; then we can choose our unit of time so that
$|\vec a| = 1$. We may prepare a pair of qubits in the entangled
state $|\phi^+\rangle$, and expose only one member of the pair to the
magnetic field while the other remains well shielded. In time $t$,
the state evolves to

$$ |\psi_a(t)\rangle = \exp(-itH_a \otimes I)\,|\phi^+\rangle
   = [\cos t\,(I \otimes I) - i\sin t\,(\vec a\cdot\vec\sigma \otimes I)]\,|\phi^+\rangle $$
$$ \qquad = \cos t\,|\phi^+\rangle
   - i\sin t\,[a_1|\psi^+\rangle - i a_2|\psi^-\rangle + a_3|\phi^-\rangle] ; \qquad (5) $$

the inner product between the states arising from Hamiltonians $H_a$
and $H_b$ becomes

$$ \langle\psi_a(t)|\psi_b(t)\rangle = \cos^2 t + (\vec a\cdot\vec b)\sin^2 t . \qquad (6) $$

For these states to be orthogonal, we require

$$ \cot^2 t = -\vec a\cdot\vec b . \qquad (7) $$

Since $\cot^2 t \ge 0$, the states are not orthogonal for any value
of $t$ unless the two magnetic field directions $\vec a$ and $\vec b$
are separated by at least 90°.
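The inner product in Eq. (6) can be checked in a few lines. The
following is a sketch of one possible derivation, not necessarily the
route the authors took; the symbol $U_a$ is introduced here only for
illustration and denotes the single-qubit evolution
$\exp(-it\,\vec a\cdot\vec\sigma)$ with $|\vec a| = 1$.

```latex
\begin{align}
U_a &= \cos t\, I - i\sin t\,\vec a\cdot\vec\sigma , \qquad
\langle\psi_a(t)|\psi_b(t)\rangle
  = \langle\phi^+|\,(U_a^\dagger U_b \otimes I)\,|\phi^+\rangle
  = \tfrac12\,\mathrm{tr}\,(U_a^\dagger U_b) , \\
U_a^\dagger U_b
 &= \cos^2 t\, I + \sin^2 t\,(\vec a\cdot\vec\sigma)(\vec b\cdot\vec\sigma)
    + i\sin t\cos t\,(\vec a-\vec b)\cdot\vec\sigma , \\
(\vec a\cdot\vec\sigma)(\vec b\cdot\vec\sigma)
 &= (\vec a\cdot\vec b)\, I + i(\vec a\times\vec b)\cdot\vec\sigma , \\
\tfrac12\,\mathrm{tr}\,(U_a^\dagger U_b)
 &= \cos^2 t + (\vec a\cdot\vec b)\sin^2 t .
\end{align}
```

The Pauli matrices are traceless, so only the terms proportional to
the identity survive the trace, and setting the result to zero gives
the orthogonality condition $\cot^2 t = -\vec a\cdot\vec b$.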

Now suppose that the magnetic field (of known magnitude) points in
one of three directions that are related by three-fold rotational
symmetry. These directions could form a planar trine with
$\vec a\cdot\vec b = \vec a\cdot\vec c = \vec b\cdot\vec c = -1/2$,
or a "lifted trine" with angle $\theta$ between each pair of
directions, where $-1/2 < \cos\theta < 0$. For any such trine of
field directions, we may evolve for a time $t$ such that

$$ \cot^2 t = -\cos\theta , \qquad (8) $$

and perform an (entangled) orthogonal measurement to determine the
field. At the point of tetrahedral symmetry, $\cos\theta = -1/3$, we
may add a fourth field direction such that the inner product for each
pair of field directions is $-1/3$; then all four directions can be
perfectly distinguished by Bell measurement.
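The tetrahedral case can be checked numerically. The sketch below
assumes the single-qubit Hamiltonian $\vec a\cdot\vec\sigma$ with
$|\vec a| = 1$ acting on the first qubit of
$(|00\rangle + |11\rangle)/\sqrt2$. The names `evolved_state`,
`inner`, `TETRAHEDRON` and `T_STAR` are hypothetical helpers chosen
for this illustration, and the four field directions are one
convenient choice of tetrahedron vertices.

```python
import math

def evolved_state(a, t):
    """Amplitudes of (U x I)|phi+> in the order |00>, |01>, |10>, |11>,
    where U = cos(t) I - i sin(t) a.sigma and a is a unit 3-vector."""
    c, s = math.cos(t), math.sin(t)
    a1, a2, a3 = a
    # a.sigma = [[a3, a1 - i a2], [a1 + i a2, -a3]]
    u = [[c - 1j * s * a3, -1j * s * (a1 - 1j * a2)],
         [-1j * s * (a1 + 1j * a2), c + 1j * s * a3]]
    r = 1.0 / math.sqrt(2.0)
    # (U x I)|phi+> has amplitude U[i][j] / sqrt(2) on |i j>
    return [u[i][j] * r for i in range(2) for j in range(2)]

def inner(p, q):
    """Inner product <p|q> of two amplitude lists."""
    return sum(x.conjugate() * y for x, y in zip(p, q))

# Four unit vectors with pairwise dot product -1/3 (assumed orientation).
r3 = 1.0 / math.sqrt(3.0)
TETRAHEDRON = [(r3, r3, r3), (r3, -r3, -r3), (-r3, r3, -r3), (-r3, -r3, r3)]

# cot^2 t = 1/3 means tan t = sqrt(3).
T_STAR = math.atan(math.sqrt(3.0))

states = [evolved_state(a, T_STAR) for a in TETRAHEDRON]
gram = [[inner(p, q) for q in states] for p in states]
```

At `T_STAR` the Gram matrix `gram` should be the identity to rounding
error, which is the statement that the four final states are mutually
orthogonal and hence perfectly distinguishable.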

In this case of four field directions with tetrahedral symmetry, the
two-bit measurement outcome achieves a two-bit information gain, if
the four directions were equally likely a priori. In contrast, no
adaptive strategy in which single qubits are measured one at a time
can attain a two-bit information gain. This separation between the
information gain attainable through entangled measurement and that
attainable through adaptive nonentangled measurement, for the problem
of distinguishing Hamiltonians, recalls the analogous separation
noted by Peres and Wootters for the problem of distinguishing
nonorthogonal states.



III. GROVER'S DATABASE SEARCH: 
IMPROVED DISTINGUISHABILITY THROUGH 
DRIVING 

Another instructive example is Grover's method for searching an
unsorted database, which (as formulated by Farhi and Gutmann) we may
interpret as a method for improving the distinguishability of a set
of Hamiltonians by adding a controlled driving term.

Consider an $N$-dimensional Hilbert space with orthonormal basis
$\{|x\rangle\}$, $x = 0, 1, 2, \ldots, N-1$, and suppose that the
Hamiltonian for this system is known to be one of the $N$ operators

$$ H_x = E|x\rangle\langle x| . \qquad (9) $$

We are to perform an experiment that will allow us to estimate the
value of $x$.

We could, for example, prepare the initial state
$\frac{1}{\sqrt2}(|y\rangle + |y'\rangle)$, allow the system to
evolve for a time $T = \pi/E$, and then perform an orthogonal
measurement in the basis
$|\pm\rangle = \frac{1}{\sqrt2}(|y\rangle \pm |y'\rangle)$. Then we
will obtain the outcome $|-\rangle$ if and only if one of $y, y'$ is
$x$. Searching for $x$ by this method, we would have to repeat the
experiment for $O(N)$ distinct initial states to have any reasonable
chance of successfully inferring the value of $x$.

Our task becomes easier if we are able to modify the Hamiltonian by
adding a term that we control to drive the system. We choose the
driving term to be

$$ H_D = E|s\rangle\langle s| , \qquad (10) $$

where $|s\rangle$ denotes the state

$$ |s\rangle = \frac{1}{\sqrt N}\sum_{y=0}^{N-1}|y\rangle . \qquad (11) $$

Then the full Hamiltonian is

$$ H = H_x + H_D = E(|x\rangle\langle x| + |s\rangle\langle s|) , \qquad (12) $$

and we can readily verify that the vectors

$$ |E_\pm\rangle = |s\rangle \pm |x\rangle \qquad (13) $$

are (unconventionally normalized!) eigenstates of $H$ with the
eigenvalues

$$ E_\pm = E(1 \pm 1/\sqrt N) . \qquad (14) $$

We may prepare the initial state

$$ |s\rangle = \tfrac12(|E_+\rangle + |E_-\rangle) ; \qquad (15) $$

since the energy splitting is $\Delta E = 2E/\sqrt N$, after a time

$$ T = \pi/\Delta E = \pi\sqrt N/2E , \qquad (16) $$

this state flops to the state

$$ \tfrac12(|E_+\rangle - |E_-\rangle) = |x\rangle . \qquad (17) $$

Thus, by performing an orthogonal measurement, we can learn the value
of $x$ with certainty.

^1 This idea was suggested to us by Chris Fuchs.
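The flop from $|s\rangle$ to $|x\rangle$ can be checked without using
the eigenvectors at all. The sketch below assumes the driven
Hamiltonian $E(|x\rangle\langle x| + |s\rangle\langle s|)$ with
$|s\rangle$ the uniform superposition, and evolves $|s\rangle$ for
$T = \pi\sqrt N/2E$ by summing the Taylor series of $e^{-iHT}$. The
function name `search_probability` and the truncation order `terms`
are choices made for this illustration only.

```python
import math

def search_probability(n_states, x, energy=1.0, terms=120):
    """Return the outcome probabilities after evolving |s> for
    T = pi * sqrt(N) / (2 E) under H = E (|x><x| + |s><s|)."""
    s_amp = 1.0 / math.sqrt(n_states)

    def apply_h(v):
        # (|s><s| v)_y = s_amp * <s|v>, and |x><x| v only touches entry x.
        overlap_s = s_amp * sum(v)
        out = [energy * s_amp * overlap_s] * n_states
        out[x] += energy * v[x]
        return out

    t = math.pi * math.sqrt(n_states) / (2.0 * energy)
    psi = [complex(s_amp)] * n_states
    term = list(psi)      # k-th Taylor term (-i t H)^k / k! applied to |s>
    result = list(psi)    # running sum of the series
    for k in range(1, terms):
        hv = apply_h(term)
        term = [(-1j * t / k) * c for c in hv]
        result = [r + c for r, c in zip(result, term)]
    return [abs(c) ** 2 for c in result]

probs = search_probability(16, 5)
```

For moderate $N$ the series converges quickly, and `probs[x]` should
be 1 to within rounding error, with all other entries negligible. A
plain Taylor sum is used only to keep the example dependency-free; it
is not a recommended integrator for large $N$.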

The driving term we have chosen is the continuous time analog of the
iteration employed by Grover for rapid searching. And as the Grover
search algorithm can be seen to be optimal, in the sense that a
marked state can be identified with high probability with the minimal
number of oracle calls, so the driving term we have chosen is optimal
in the sense that it enables us to identify the value of the
classical parameter labeling the Hamiltonian in the minimal time, at
least asymptotically for $N$ large. (In a physics experiment, the
"oracle" is Nature, whose secrets we are eager to expose.) For this
Grover-Farhi-Gutmann problem, we can make a definite statement about
how to optimize expenditure of a valuable resource (time) in the
identification of a system Hamiltonian.

We also note that adding a driving term can sometimes improve the
efficacy of the superdense coding method described in Sec. II. For
example, in the case of three magnetic fields of equal magnitude with
threefold symmetry, but with an angle between fields of less than
90°, applying a driving field along the line of symmetry can make the
resultant field directions perfectly distinguishable. In fact,
Beckman has shown that for any three field vectors forming a triangle
that is isosceles or nearly isosceles, a suitable driving field can
always be found such that the field directions can be distinguished
perfectly.

IV. DISTINGUISHING TWO ALTERNATIVES 

Let's consider the special case in which our apparatus is known to be
governed by one of two possible Hamiltonians $H_1$ or $H_2$. If the
system is two dimensional, we are trying to distinguish two possible
values $\vec a$, $\vec b$ of the magnetic field with a spin-1/2
probe. Suppose for simplicity that the two fields have the same
magnitude (normalized to unity), but differing directions.

Assuming that we are unable to modify the Hamiltonian by adding a
driving term, the optimal strategy is to choose an initial
polarization vector that bisects the two field directions $\vec a$,
$\vec b$. Depending on the actual value of the field, the
polarization will precess on one of two possible cones. If the angle
$\theta$ between $\vec a$ and $\vec b$ is $\ge 90°$, then the two
possible polarizations will eventually be back-to-back; an orthogonal
measurement performed at that time will distinguish $\vec a$ and
$\vec b$ perfectly. But if $\theta < 90°$, the two polarizations are
never back-to-back; the best strategy is to wait until the angle
between the polarizations is maximal, and to then perform the
orthogonal measurement that best distinguishes them. We cannot
perfectly distinguish the two field directions by this method.

On the other hand, if we are able to apply a known driving magnetic
field in addition to the unknown field that is to be determined, then
two fields $\vec a$ and $\vec b$ can always be perfectly
distinguished. If we apply the field $-\vec b$, then the problem is
one of distinguishing the trivial Hamiltonian from

$$ H_{diff} = (\vec a - \vec b)\cdot\vec\sigma . \qquad (18) $$

We can choose an initial polarization orthogonal to
$\vec a - \vec b$, and wait just long enough for $H_{diff}$ to rotate
the polarization by $\pi$. Then an orthogonal measurement perfectly
distinguishes $H_{diff}$ from the trivial Hamiltonian.

Evidently, the same strategy can be applied to distinguish two
Hamiltonians $H_1$ and $H_2$ in a Hilbert space of arbitrary
dimension. We drive the system with $-H_2$; then to distinguish the
trivial Hamiltonian from $H_1 - H_2$, we choose the initial state

$$ \frac{1}{\sqrt2}(|E_{min}\rangle + |E_{max}\rangle) , \qquad (19) $$

where $E_{min}$, $E_{max}$ are the minimal and maximal eigenvalues of
$H_1 - H_2$. After a time $t$ with

$$ t(E_{max} - E_{min}) = \pi , \qquad (20) $$

this state evolves to the orthogonal state
$\frac{1}{\sqrt2}(|E_{min}\rangle - |E_{max}\rangle)$, so that the
trivial and nontrivial Hamiltonians can be perfectly distinguished.

In the case of the two-dimensional version of the "Grover problem"
with $H_1 = |0\rangle\langle 0|$ and $H_2 = |1\rangle\langle 1|$,
this choice for the driving Hamiltonian actually outperforms the
Grover driving term of Eq. (10): the two Hamiltonians can be
distinguished in a time that is shorter by a factor of $\sqrt2$. So
while the Grover strategy is optimal for asymptotically large $N$, it
is not actually optimal for $N = 2$.
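The timing claims of the last two paragraphs can be illustrated with
a short calculation. The sketch below assumes the probe starts in the
equal superposition of the extremal eigenstates of the net
Hamiltonian, so that its overlap with the evolved state depends only
on the two extremal eigenvalues. The function names `overlap_after`,
`perfect_time` and `grover_time` are hypothetical and introduced only
for this example.

```python
import cmath
import math

def overlap_after(t, e_min, e_max):
    """|<psi0| exp(-i t H) |psi0>| for psi0 = (|E_min> + |E_max>) / sqrt(2)."""
    return abs(0.5 * (cmath.exp(-1j * t * e_min) + cmath.exp(-1j * t * e_max)))

def perfect_time(e_min, e_max):
    """Shortest time at which the evolved state is orthogonal to psi0."""
    return math.pi / (e_max - e_min)

def grover_time(n_states, energy=1.0):
    """Flop time pi * sqrt(N) / (2 E) of the driven search Hamiltonian."""
    return math.pi * math.sqrt(n_states) / (2.0 * energy)

# Two-dimensional "Grover problem": H1 - H2 = |0><0| - |1><1| has
# eigenvalues +1 and -1 (taking E = 1).
t_cancel = perfect_time(-1.0, 1.0)
ratio = grover_time(2) / t_cancel
```

With these conventions `overlap_after(t_cancel, -1.0, 1.0)` vanishes
to rounding error, and `ratio` is the advantage of the cancelling
drive over the Grover drive at $N = 2$.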



V. DISTINGUISHING TWO ALTERNATIVES IN 
A FIXED TIME 

Let us now suppose that we are to distinguish between two
time-independent Hamiltonians $H_1$ and $H_2$, and that a fixed
duration $t$ has been allotted to perform the experiment. Is the
driving strategy described above (in which $-H_2$ is added to the
Hamiltonian) always the best possible?

If we have the freedom to add a driving term of our choice, then we
may assume without loss of generality that we are to distinguish the
nontrivial Hamiltonian $H$ from the trivial Hamiltonian 0. As already
noted, if the largest difference $\Delta E = E_{max} - E_{min}$ of
eigenvalues of $H$ satisfies $t\Delta E \ge \pi$, then $H$ can be
perfectly distinguished from 0; let us therefore suppose that
$t\Delta E < \pi$.

If we add a time-independent driving term $K$ to the Hamiltonian, and
choose an initial state $|\psi_0\rangle$, then after a time $t$, we
will need to distinguish the two states

$$ e^{-itK}|\psi_0\rangle , \qquad e^{-it(H+K)}|\psi_0\rangle . \qquad (21) $$

Two pure states will be more distinguishable when their inner product
is smaller. Therefore, to best distinguish $H + K$ from $K$ we should
choose $|\psi_0\rangle$ to minimize the inner product

$$ |\langle\psi_0| e^{itK} e^{-it(H+K)} |\psi_0\rangle| . \qquad (22) $$

If we expand $|\psi_0\rangle$ in terms of the eigenstates
$\{|a\rangle\}$ of $e^{itK} e^{-it(H+K)}$, with eigenvalues
$\{e^{-i\phi_a}\}$,

$$ |\psi_0\rangle = \sum_a \alpha_a |a\rangle , \qquad (23) $$

this inner product becomes

$$ |\langle\psi_0| e^{itK} e^{-it(H+K)} |\psi_0\rangle|
   = \Big| \sum_a |\alpha_a|^2 e^{-i\phi_a} \Big| . \qquad (24) $$

The right-hand side of Eq. (24) is the modulus of a convex sum of
points on the unit circle. Assuming the modulus is bounded from zero,
it attains its minimum when $|\psi_0\rangle$ is the equally weighted
superposition of the extremal eigenstates of $e^{itK} e^{-it(H+K)}$,
those whose eigenvalues are maximally separated on the unit circle.
For $K = 0$, the minimum is $\cos(t\Delta E/2)$, where $\Delta E$ is
the difference of the maximal and minimal eigenvalues of $H$.
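The claim that the minimum of a convex sum of phases is attained by
equal weights on the two extremal phases can be illustrated by brute
force for the undriven case $K = 0$. The sketch below takes a
three-level spectrum chosen arbitrarily for this example, scans the
weights over a grid on the probability simplex, and returns the
smallest modulus found. The function name `min_overlap` and the grid
resolution `steps` are assumptions of the illustration, not part of
the original argument.

```python
import cmath
import math

def min_overlap(energies, t, steps=200):
    """Grid search for min |sum_a p_a exp(-i t E_a)| over weights
    p_a >= 0 with sum p_a = 1, for a three-level spectrum."""
    phases = [cmath.exp(-1j * t * e) for e in energies]
    best = 1.0
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            weights = (i / steps, j / steps, (steps - i - j) / steps)
            value = abs(sum(w * z for w, z in zip(weights, phases)))
            best = min(best, value)
    return best

# Example spectrum with t * (E_max - E_min) = 2 < pi.
ENERGIES = (0.0, 0.4, 1.0)
T = 2.0
found = min_overlap(ENERGIES, T)
predicted = math.cos(T * (max(ENERGIES) - min(ENERGIES)) / 2.0)
```

Because `steps` is even, the grid contains the point with weight 1/2
on each extremal level, so `found` should agree with `predicted` to
rounding error; putting any weight on the middle level only moves the
convex sum farther from the origin.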

We prove in Appendix A that turning on a nonzero driving term $K$ can
never cause the extremal eigenvalues to separate further, and
therefore can never improve the distinguishability of the two states
in Eq. (21). Therefore, $K = 0$ is the optimal driving term for
distinguishing the two Hamiltonians. In other words, if we wish to
distinguish between two Hamiltonians $H_1$ and $H_2$, it is always
best to turn on a driving term that precisely cancels one of the two.

The above discussion encompasses the strategy of introducing an
ancilla entangled with the probe (which proved effective for the
problem of distinguishing three or more alternatives). If we wish to
distinguish two Hamiltonians $H_1 \otimes I$ and $H_2 \otimes I$ that
both act trivially on the ancilla, the optimal driving term exactly
cancels one of them (e.g., $K = -H_2 \otimes I$), and so it too acts
trivially on the ancilla. We derive no benefit from the ancilla when
there are only two alternatives.

Similarly, if we are trying to distinguish only two time-independent
signals in an allotted time, it seems likely there is no advantage to
performing a sequence of weak measurements, and adapting the driving
field in response to the incoming stream of measurement data.^2



VI. MORE ALTERNATIVES: ADAPTIVE 
DRIVING 

Now suppose that there are $N$ possible Hamiltonians
$H_1, H_2, \ldots, H_N$. If there is no time limitation, we can
distinguish them perfectly by implementing an adaptive procedure; we
make a series of measurements, modifying our driving term and initial
state in response to the stream of measurement outcomes.

The correct Hamiltonian can be identified by pairwise elimination.
First, assume that either $H_1$ or $H_2$ is the actual Hamiltonian,
and apply a driving term to perfectly distinguish them, say
$H_D = -H_1$. After preparing the appropriate initial state and
waiting the appropriate time, we make an orthogonal measurement with
two outcomes; the result indicates that either $H_1$ or $H_2$ is the
actual Hamiltonian.^3 If the result is $H_1$, there are two
possibilities: either $H_1$ really is the Hamiltonian, or the
assumption that one of $H_1$ or $H_2$ is the Hamiltonian was wrong.
Either way, $H_2$ has been eliminated. Similarly, if $H_2$ is found,
$H_1$ is eliminated. This procedure can then be repeated, eliminating
one Hamiltonian per measurement, thereby perfectly distinguishing
among the $N$ Hamiltonians in a total of $N - 1$ measurements.

This algorithm is quite inefficient, however. The measurement record
is $N - 1$ bits long, while the information gain is only $\log N$
bits.
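The bookkeeping of pairwise elimination can be sketched as a toy
classical simulation. The model below is an idealization assumed for
this example only: each two-outcome measurement returns the true
Hamiltonian whenever it is one of the two candidates being compared,
and an arbitrary (here random) result otherwise. The function name
`identify_by_elimination` is hypothetical.

```python
import math
import random

def identify_by_elimination(n_candidates, true_index, rng):
    """Run the elimination tournament; return (survivor, measurement count)."""
    survivor = 0
    measurements = 0
    for challenger in range(1, n_candidates):
        measurements += 1
        if true_index == survivor:
            winner = survivor
        elif true_index == challenger:
            winner = challenger
        else:
            # Neither hypothesis is correct, so the outcome carries no
            # guarantee; model it as a coin flip.
            winner = rng.choice((survivor, challenger))
        survivor = winner
    return survivor, measurements

rng = random.Random(0)
n = 16
results = [identify_by_elimination(n, k, rng) for k in range(n)]
record_bits = n - 1          # one two-outcome measurement per elimination
information_bits = math.log2(n)
```

Every run identifies the true index after exactly `n - 1`
measurements, so the record length grows linearly in $N$ while the
information gained grows only logarithmically.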



VII. ADAPTIVE PHASE MEASUREMENT AND 
THE SEMICLASSICAL QUANTUM FOURIER 
TRANSFORM 

Far more efficient adaptive procedures can be formulated in some
cases. Consider, for example, a single qubit in a magnetic field of
known direction but unknown magnitude, so that

$$ H_\omega = \frac{\omega}{2}\,\sigma_3 , \qquad (25) $$

and let us imagine that the value of the frequency $\omega$ is chosen
equiprobably from among $N = 2^n$ equally spaced possible values.
Without loss of generality, we may normalize the field so that the
possible values range from 0 to $1 - 2^{-n}$; then $\omega$ has a
binary expansion

$$ \omega = .\omega_1\omega_2\cdots\omega_n \qquad (26) $$

that terminates after at most $n$ bits.

The initial state
$|\psi_0\rangle = \frac{1}{\sqrt2}(|0\rangle + |1\rangle)$ evolves in
time $t$ to

$$ |\psi(t)\rangle = \frac{1}{\sqrt2}(|0\rangle + e^{-i\omega t}|1\rangle) \qquad (27) $$

(up to an overall phase). If we wait for a time $t_n = \pi 2^n$, the
final state is

$$ |\psi(t_n)\rangle = \frac{1}{\sqrt2}(|0\rangle + e^{-i\pi\omega_n}|1\rangle) . \qquad (28) $$

Now measurement in the $\{\frac{1}{\sqrt2}(|0\rangle \pm |1\rangle)\}$
basis indicates (with certainty) whether the bit $\omega_n$ is 0 or
1. This outcome divides the set of possible Hamiltonians in half,
providing one bit of classical information.

^2 That this might be the case was suggested to us by Chris Fuchs.

^3 Actually, in a Hilbert space of high dimension, we can make a more
complete measurement that will typically return the result that
neither $H_1$ nor $H_2$ is the actual Hamiltonian.

The set of remaining possible Hamiltonians is still evenly spaced,
but it may have a constant offset, depending on the value of
$\omega_n$. However, the value of $\omega_n$ is now known, so the
offset can be eliminated. Specifically, if we again prepare
$|\psi_0\rangle$ and now evolve for a time $t_{n-1} = \pi 2^{n-1}$,
we obtain the final state

$$ |\psi(t_{n-1})\rangle = \frac{1}{\sqrt2}\left(|0\rangle
   + e^{-i\pi(\omega_{n-1}.\omega_n)}|1\rangle\right) . \qquad (29) $$

Since $\omega_n$ is known, we can perform a phase transformation
(perhaps by applying an additional driving magnetic field) to
eliminate the phase $e^{-i\pi(.\omega_n)}$. Measuring again in the
$\{\frac{1}{\sqrt2}(|0\rangle \pm |1\rangle)\}$ basis determines the
value of $\omega_{n-1}$.

By continuing this procedure until all bits of $\omega$ are known, we
perfectly distinguish the $2^n$ possible Hamiltonians in just $n$
measurements. The procedure is optimal in the sense that we gain one
full bit of information about the Hamiltonian in each measurement.
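The bit-by-bit procedure can be sketched as follows. The example
assumes that the probe acquires the relative phase $e^{-i\omega t}$
between $|1\rangle$ and $|0\rangle$, that $\omega$ has an exact
$n$-bit binary expansion, and that the known lower-order bits are
undone by an ideal phase correction before each measurement. The
function name `estimate_bits` is hypothetical, and the measurement is
modeled simply by the probability of the
$\frac{1}{\sqrt2}(|0\rangle - |1\rangle)$ outcome.

```python
import math

def estimate_bits(omega, n):
    """Recover the bits omega_1 ... omega_n of omega = .omega_1...omega_n,
    starting from the least significant bit."""
    bits = [0] * (n + 1)                      # bits[k] holds omega_k, 1-indexed
    for k in range(n, 0, -1):
        t_k = math.pi * 2 ** k                # exposure time for bit k
        phase = -omega * t_k                  # relative phase of |1>
        # Phase offset pi * (.omega_{k+1} ... omega_n) from bits already known.
        known = sum(bits[j] * 2.0 ** (k - j) for j in range(k + 1, n + 1))
        phase += math.pi * known              # ideal phase correction
        # (|0> + e^{i phase}|1>)/sqrt(2) gives the "minus" outcome with
        # probability sin^2(phase / 2), which is 0 or 1 for exact n-bit omega.
        p_minus = math.sin(phase / 2.0) ** 2
        bits[k] = 1 if p_minus > 0.5 else 0
    return bits[1:]

# omega = 0.1011 in binary = 11/16
recovered = estimate_bits(11 / 16, 4)
```

Each pass through the loop corresponds to one measurement, so $n$
measurements fix all $n$ bits, in contrast to the $N - 1 = 2^n - 1$
measurements needed by pairwise elimination.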

Up until now we have imagined that the frequency $\omega$ takes one
of $2^n$ equally spaced discrete values, but no such restriction is
really necessary. Indeed, what we have described is precisely the
implementation of the $n$-qubit semiclassical quantum Fourier
transform as formulated by Griffiths and Niu (whose relevance to
phase estimation was emphasized by Cleve et al.). Thus the same
procedure can be applied to obtain an estimate of the frequency to
$n$-bit precision, even if the frequency is permitted to take an
arbitrary real value in the interval $[0,1)$.

Suppose that we attach to $n$ spins the labels
$\{0, 1, \ldots, n-2, n-1\}$, and expose the $k$th spin to the field
for time $\pi 2^{k+1}$; we thus prepare the $n$-qubit state

$$ \bigotimes_{k=0}^{n-1}\frac{1}{\sqrt2}\left(|0\rangle
   + e^{-i\pi 2^{k+1}\omega}|1\rangle\right)
   = \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} e^{-2\pi i\omega y}|y\rangle . \qquad (30) $$

The adaptive algorithm is equivalent to the quantum Fourier transform
followed by measurement; hence the $n$-bit measurement outcome
$\tilde\omega$ occurs with probability

$$ \mathrm{Prob}(\tilde\omega) = \left|\frac{1}{2^n}\sum_{y=0}^{2^n-1}
   \exp[-2\pi i y(\omega - \tilde\omega)]\right|^2 . \qquad (31) $$

If $\omega$ really does terminate in $n$ bits, then the outcome
$\tilde\omega$ is guaranteed to be its correct binary expansion. But
even if the binary expansion of $\omega$ does not terminate, the
probability that our estimate $\tilde\omega$ is correct to $n$ bits
of precision is still of order one.^4
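The outcome distribution can be evaluated directly. The sketch below
assumes the probability has the form
$|2^{-n}\sum_y \exp[-2\pi i y(\omega - \tilde\omega)]|^2$ with
$\tilde\omega = j/2^n$, and tabulates it for a frequency whose binary
expansion does not terminate. The function name
`outcome_probability` and the choice $\omega = 1/3$, $n = 5$ are
illustrative assumptions.

```python
import cmath
import math

def outcome_probability(omega, j, n):
    """Probability that the n-bit outcome is j / 2**n when the true
    frequency is omega (both measured in units where omega lies in [0, 1))."""
    size = 2 ** n
    amplitude = sum(
        cmath.exp(-2j * math.pi * y * (omega - j / size)) for y in range(size)
    ) / size
    return abs(amplitude) ** 2

n = 5
omega = 1.0 / 3.0
distribution = [outcome_probability(omega, j, n) for j in range(2 ** n)]
best_j = max(range(2 ** n), key=lambda j: distribution[j])
```

For a terminating expansion the distribution is concentrated entirely
on the correct outcome. For $\omega = 1/3$ the most likely outcome
`best_j` is the grid point nearest to $2^n\omega$, and its
probability stays above the standard $4/\pi^2$ lower bound for the
nearest $n$-bit estimate.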

Of course, to measure the frequency to a precision $\Delta\omega$ of
order $2^{-n}$, we need to expose our probe spins to the unknown
Hamiltonian for a total time $T$ of order $2\pi \cdot 2^n$. The
accuracy is limited by an energy-time uncertainty relation of the
form $T\Delta\omega \sim 1$.

The semiclassical quantum Fourier transform provides an elegant
solution to the problem of performing an ideal "phase measurement" in
the Hilbert space of $n$ qubits. More broadly, any $N$-dimensional
Hilbert space with a preferred basis
$\{|k\rangle, k = 0, 1, \ldots, N-1\}$ has a complementary basis of
phase states

$$ |\varphi\rangle = \frac{1}{\sqrt N}\sum_{k=0}^{N-1} e^{ik\varphi}|k\rangle , \qquad (32) $$

with

$$ \varphi = 2\pi j/N , \quad j = 0, 1, \ldots, N-1 . \qquad (33) $$

For example, the Hilbert space could be the truncated space of a
harmonic oscillator like a mode of the electromagnetic field, with
the occupation number restricted to be less than $N$; then the states
$|\varphi\rangle$ are the "phase squeezed" states of the oscillator
that have minimal phase uncertainty. Since a POVM in an
$N$-dimensional Hilbert space can acquire no more than $\log N$ bits
of information about the preparation of the quantum state, the phase
of an oscillator with occupation number less than $N$ can be measured
to at best $\log N$ bits of accuracy. While it is easy to do an
orthogonal measurement in the occupation number basis with an
efficient photodetector, an orthogonal measurement in the
$|\varphi\rangle$ basis is quite difficult to realize in the
laboratory.
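The complementarity of the phase basis and the preferred basis is
easy to check numerically. The sketch below assumes phase states of
the form $N^{-1/2}\sum_k e^{ik\varphi}|k\rangle$ with
$\varphi = 2\pi j/N$; the sign convention in the exponent is an
assumption of this example and does not affect the checks. The names
`phase_state` and `inner` are hypothetical.

```python
import cmath
import math

def phase_state(j, dim):
    """Amplitudes of the j-th phase state in the preferred basis |k>."""
    phi = 2.0 * math.pi * j / dim
    return [cmath.exp(1j * k * phi) / math.sqrt(dim) for k in range(dim)]

def inner(u, v):
    """Inner product <u|v> of two amplitude lists."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

DIM = 8
basis = [phase_state(j, DIM) for j in range(DIM)]
gram = [[inner(u, v) for v in basis] for u in basis]
number_probabilities = [abs(c) ** 2 for c in basis[3]]
```

The Gram matrix is the identity, so the $N$ phase states form an
orthonormal basis, and every phase state is spread uniformly over the
preferred basis with probability $1/N$ per level.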

But if the standard basis is the computational basis in the
$2^n$-dimensional Hilbert space of $n$ qubits, then an ideal phase
measurement is simple to realize. Since the phase eigenstates are
actually not entangled states, we can carry out the measurement,
adaptively, one qubit at a time.

Note that if we had an arbitrarily powerful quantum computer with an
arbitrarily large amount of quantum memory, then adaptive measurement
strategies might seem superfluous. We could achieve the same effect
by introducing a large ancilla and a driving Hamiltonian that acts on
probe and ancilla, with all measurements postponed to the very end.
But the semiclassical quantum Fourier transform illustrates that
adaptive techniques can reduce the complexity of the quantum
information processing required to perform the measurement. In many
cases, an adaptive strategy may be realizable in practice, while the
equivalent unitary strategy is completely infeasible.

^4 We might also use the QFT to compute eigenvalues of a known
many-body Hamiltonian, rather than measure eigenvalues of an unknown
one.



VIII. DISTINGUISHABILITY AND 
DECOHERENCE 

In all of our examples so far, we have ignored noise and 
decoherence. In practice, decoherence may compromise 
our ability to decipher the classical signal with high con- 
fidence. Finding ways to improve measurement accuracy 
by effectively coping with decoherence is an important 
challenge faced by quantum information theory. 

If there is decoherence, our aim is to gain information about the
value of a parameter in a master equation rather than a Hamiltonian.
To be concrete, consider a single qubit governed by an unknown
Hamiltonian $H$, and also subject to decoherence described by the
"depolarizing channel"; the density matrix $\rho$ of the qubit obeys
the master equation

$$ \dot\rho = -i[H, \rho] - \Gamma\left(\rho - \tfrac12 I\right) , \qquad (34) $$

where $\Gamma$ is the (known) damping rate. If we express $\rho$ in
terms of the polarization vector $\vec P$,

$$ \rho = \tfrac12(I + \vec P\cdot\vec\sigma) , \qquad (35) $$

and the Hamiltonian as

$$ H = \frac{\omega}{2}\,\hat a\cdot\vec\sigma , \qquad (36) $$

then the master equation becomes

$$ \dot{\vec P} = \omega(\hat a\times\vec P) - \Gamma\vec P . \qquad (37) $$

The polarization precesses uniformly with circular frequency $\omega$
about the $\hat a$-axis as it contracts with lifetime $\Gamma^{-1}$.
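The step from the master equation to the equation of motion for the
polarization can be filled in as follows. This is a sketch that
assumes the depolarizing master equation in the form
$\dot\rho = -i[H,\rho] - \Gamma(\rho - \tfrac12 I)$, the
parametrization $\rho = \tfrac12(I + \vec P\cdot\vec\sigma)$, and the
Hamiltonian $H = \tfrac{\omega}{2}\hat a\cdot\vec\sigma$ with
$\hat a$ a unit vector.

```latex
\begin{align}
[\hat a\cdot\vec\sigma,\ \vec P\cdot\vec\sigma]
 &= 2i\,(\hat a\times\vec P)\cdot\vec\sigma , \\
-i[H,\rho]
 &= -i\,\frac{\omega}{2}\cdot\frac12\,[\hat a\cdot\vec\sigma,\ \vec P\cdot\vec\sigma]
  = \frac{\omega}{2}\,(\hat a\times\vec P)\cdot\vec\sigma , \\
-\Gamma\left(\rho - \tfrac12 I\right)
 &= -\frac{\Gamma}{2}\,\vec P\cdot\vec\sigma , \\
\tfrac12\,\dot{\vec P}\cdot\vec\sigma
 &= \frac{\omega}{2}\,(\hat a\times\vec P)\cdot\vec\sigma
    - \frac{\Gamma}{2}\,\vec P\cdot\vec\sigma
 \ \Longrightarrow\
 \dot{\vec P} = \omega\,(\hat a\times\vec P) - \Gamma\,\vec P .
\end{align}
```

The last implication uses the linear independence of the Pauli
matrices, so the coefficients of $\vec\sigma$ on both sides can be
equated component by component.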

Suppose that we are to distinguish among two possible 
Hamiltonians, which are assumed to be equiprobable. If 
we are able to add a driving term, we may assume that 
the two are the trivial Hamiltonian and 



H = - as . 
2 ^ 



(38) 



We choose the initial polarization vector $\vec P_0 = (1,0,0)$. Then if the Hamiltonian is trivial, the polarization contracts as

$$\vec P(t)_{\rm triv} = e^{-\Gamma t}(1,0,0), \tag{39}$$

while under the nontrivial Hamiltonian it contracts and rotates as

$$\vec P(t)_{\rm nontriv} = e^{-\Gamma t}(\cos\omega t, \sin\omega t, 0). \tag{40}$$

When is the best time to measure the polarization? We should wait until $\vec P_{\rm triv}$ and $\vec P_{\rm nontriv}$ point in distinguishable directions, but if we wait too long, the states will depolarize. The optimal measurement to distinguish the two is an orthogonal measurement of the polarization along the axis normal to the bisector of the vectors $\vec P(t)_{\rm triv}$ and $\vec P(t)_{\rm nontriv}$. At time $t$ the probability that this measurement identifies the Hamiltonian incorrectly is

$$P_{\rm error} = \tfrac{1}{2} - \tfrac{1}{2}\,e^{-\Gamma t}\sin\frac{\omega t}{2}. \tag{41}$$

This error probability is minimized, and the information gain from the measurement is maximized, at a time $t$ such that

$$\tan\frac{\omega t}{2} = \frac{\omega}{2\Gamma}. \tag{42}$$

If $\Gamma/\omega \ll 1$, this time is close to $\pi/\omega$, the time at which we would measure to perfectly distinguish the Hamiltonians in the absence of decoherence. But if $\Gamma/\omega \gg 1$, then we should measure after a time $t \sim \Gamma^{-1}$ comparable to the lifetime.

More generally, consider an ensemble of two density operators $\rho_1$ and $\rho_2$ with a priori probabilities $p_1$ and $p_2$ (where $p_1 + p_2 = 1$), and imagine that an unknown state has been drawn from this ensemble. A procedure for deciding whether the unknown state is $\rho_1$ or $\rho_2$ can be modeled as a POVM with two outcomes. The two-outcome POVM that minimizes the probability of making an incorrect decision is a measurement of the orthogonal projection onto the space spanned by the eigenstates of $p_1\rho_1 - p_2\rho_2$ with positive eigenvalues [?]. The minimal error probability achieved by this measurement is

$$P_{\rm error} = \tfrac{1}{2} - \tfrac{1}{2}\,{\rm tr}\left|p_1\rho_1 - p_2\rho_2\right|. \tag{43}$$

Correspondingly, if we are to identify an unknown superoperator as one of $\$_1$ and $\$_2$ (with a priori probabilities $p_1$ and $p_2$), then the way to distinguish $\$_1$, $\$_2$ with minimal probability of error is to choose our initial state $\rho_0 = |\psi_0\rangle\langle\psi_0|$ to minimize

$$P_{\rm error} = \tfrac{1}{2} - \tfrac{1}{2}\,{\rm tr}\left|(p_1\$_1 - p_2\$_2)\,\rho_0\right|. \tag{44}$$

In the case of interest to us, the superoperators $\$_1$ and $\$_2$ are obtained by integrating, for time $t$, master equations with Hamiltonians $H_1$ and $H_2$ respectively. We minimize the error probability in Eq. (44) with respect to $t$ to complete the optimization.



'We thank Chris Fuchs for a helpful discussion of this point. 
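As a rough numerical illustration of the single-qubit optimization discussed in this section, the sketch below evaluates the error probability $\tfrac12 - \tfrac12 e^{-\Gamma t}\sin(\omega t/2)$, solves $\tan(\omega t/2) = \omega/2\Gamma$ for the optimal measurement time, and cross-checks against the trace-norm expression $\tfrac12 - \tfrac12{\rm tr}|p_1\rho_1 - p_2\rho_2|$ specialized to qubits. The function names are hypothetical, $\Gamma > 0$ is assumed, and the qubit trace norm is computed from the eigenvalues $\tfrac12[(p_1 - p_2) \pm |p_1\vec P_1 - p_2\vec P_2|]$ of $p_1\rho_1 - p_2\rho_2$.

```python
import math


def error_probability(t, omega, gamma):
    """Error probability 1/2 - 1/2 exp(-gamma t) sin(omega t / 2)."""
    return 0.5 - 0.5 * math.exp(-gamma * t) * math.sin(0.5 * omega * t)


def optimal_time(omega, gamma):
    """Smallest positive t with tan(omega t / 2) = omega / (2 gamma); needs gamma > 0."""
    return (2.0 / omega) * math.atan(omega / (2.0 * gamma))


def helstrom_error_qubit(P1, P2, p1=0.5, p2=0.5):
    """Minimal error 1/2 - 1/2 tr|p1 rho1 - p2 rho2| for two qubit states.

    P1 and P2 are polarization vectors; the operator p1 rho1 - p2 rho2 has
    eigenvalues ((p1 - p2) +/- |p1 P1 - p2 P2|) / 2.
    """
    v = [p1 * a - p2 * b for a, b in zip(P1, P2)]
    norm = math.sqrt(sum(x * x for x in v))
    d = p1 - p2
    trace_norm = 0.5 * (abs(d + norm) + abs(d - norm))
    return 0.5 - 0.5 * trace_norm


if __name__ == "__main__":
    omega = 1.0
    for gamma in (0.01, 0.3, 10.0):  # illustrative damping rates
        t_star = optimal_time(omega, gamma)
        print(f"gamma/omega={gamma / omega:6.2f}  t*={t_star:.4f}  "
              f"P_error={error_probability(t_star, omega, gamma):.4f}")
```

For small $\Gamma/\omega$ the printed optimal time approaches $\pi/\omega$, and for large $\Gamma/\omega$ it approaches $1/\Gamma$, in line with the two limits described in the text.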
IX. ENTANGLEMENT AND FREQUENCY
MEASUREMENT

Consider again the case in which the Hamiltonian is known to be of the form

$$H_\omega = \tfrac{\omega}{2}\,\sigma_3, \tag{45}$$

but where the frequency $\omega$ is unknown. For the moment, let us neglect decoherence, but suppose that we have been provided with a large number $n$ of qubits that we may use to perform an experiment to determine $\omega$ in a fixed total time $t$. What is the most effective way to employ our qubits?

Consider two strategies. In the first, we prepare $n$ identical qubits polarized along the x-axis. They precess in the field described by $H_\omega$ for time $t$, and then the spin along the x-axis is measured. Each spin will be found to be pointing "up" with probability

$$P = \tfrac{1}{2}(1 + \cos\omega t). \tag{46}$$

Because the measurement is repeated many times, we will be able to estimate the probability $P$ to an accuracy

$$\Delta P = \sqrt{P(1-P)/n} = \frac{|\sin\omega t|}{2\sqrt{n}}, \tag{47}$$

and so determine the value of $\omega$ to accuracy

$$\Delta\omega = \frac{\Delta P}{t\,|dP/d(\omega t)|} = \frac{1}{t\sqrt{n}}. \tag{48}$$

The accuracy improves like $1/\sqrt{n}$ as we increase the number of available qubits with the time $t$ fixed.

The second strategy is to prepare an entangled "cat" state of $n$ ions

$$|\psi_0\rangle = \tfrac{1}{\sqrt{2}}\left(|000\ldots0\rangle + |111\ldots1\rangle\right). \tag{49}$$

The advantage of the entangled state is that it precesses $n$ times faster than a single qubit; in time $t$ it evolves to

$$|\psi(t)\rangle = \tfrac{1}{\sqrt{2}}\left(|000\ldots0\rangle + e^{in\omega t}|111\ldots1\rangle\right) \tag{50}$$

(up to an overall phase). If we now perform an orthogonal measurement that projects onto the basis $\tfrac{1}{\sqrt{2}}\left(|000\ldots0\rangle \pm |111\ldots1\rangle\right)$ (e.g. a measurement of the entangled observable $\sigma_1\otimes\sigma_1\otimes\cdots\otimes\sigma_1$) then we will obtain the "+" outcome with probability

$$P = \tfrac{1}{2}(1 + \cos n\omega t). \tag{51}$$

By this method, $n\omega t$ can be measured to order one accuracy, so that

$$\Delta\omega \sim \frac{1}{tn}, \tag{52}$$



a more favorable scaling with n than in Eq. (48). 
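The two scalings can be compared with a short error-propagation sketch. This is an illustration under stated assumptions rather than the authors' procedure: `frequency_uncertainty` is a hypothetical helper that propagates the binomial uncertainty $\sqrt{P(1-P)/N}$ through a numerically estimated slope $dP/d\omega$, applied once to $n$ independent qubits ($N = n$ shots of the single-qubit signal) and once to a single shot of the $n$-qubit cat-state signal.

```python
import math


def frequency_uncertainty(prob, omega, shots, h=1e-6):
    """Propagate sqrt(P(1-P)/shots) through |dP/d omega| (central difference)."""
    p = prob(omega)
    slope = (prob(omega + h) - prob(omega - h)) / (2 * h)
    return math.sqrt(p * (1 - p) / shots) / abs(slope)


def product_probability(t):
    """Single-qubit signal P = (1 + cos(omega t)) / 2."""
    return lambda omega: 0.5 * (1 + math.cos(omega * t))


def cat_probability(n, t):
    """n-qubit cat-state signal P = (1 + cos(n omega t)) / 2."""
    return lambda omega: 0.5 * (1 + math.cos(n * omega * t))


if __name__ == "__main__":
    t, omega = 2.0, 0.3  # illustrative values, chosen away from sin = 0
    for n in (10, 100, 1000):
        shot_noise = frequency_uncertainty(product_probability(t), omega, shots=n)
        cat = frequency_uncertainty(cat_probability(n, t), omega, shots=1)
        print(f"n={n:5d}  product: {shot_noise:.5f}  cat: {cat:.5f}")
```

Under these assumptions the product-state uncertainty reproduces $1/(t\sqrt{n})$ while the single-shot cat-state estimate gives $1/(tn)$, so the ratio between the two grows as $\sqrt{n}$.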

This idea of exploiting the rapid precession of entangled states to achieve a precision beyond the shot-noise limit has been proposed in both frequency measurement [?] and optical interferometry [?]. (One realization of this idea is the proposal by Caves [27] to allow a squeezed vacuum state to enter the dark port of an interferometer; the squeezing induces the $n$ photons entering the other port to make correlated "decisions" about which arm of the interferometer to follow.)



X. ENTANGLEMENT VERSUS DECOHERENCE 



In both Eq. (48) and Eq. (52), the accuracy of the frequency measurement improves with the elapsed time $t$ as $1/t$. But so far we have neglected decoherence. If the single-qubit state decays at a rate $\Gamma$, then we have seen that the optimal time at which to perform a measurement will be of order $\Gamma^{-1}$. The entangled strategy will still be better if we are constrained to perform the measurement in a time $t \ll \Gamma^{-1}$, but further analysis is needed to determine which method is better if we are free to choose the time $t$ to optimize the accuracy.

In fact, as Huelga et al. [28] have emphasized, an entangled state is fragile, and its faster precession can be offset by its faster decay rate. Suppose that two qubits are available, both independently subjected to the depolarizing channel with decay rate $\Gamma$. If we prepare the unentangled state, each qubit has the initial pure-state density matrix

$$\rho_0 = \tfrac{1}{2}(I + \sigma_1) \tag{53}$$

polarized along the x-axis, and evolves in time $t$ to

$$\rho(t) = \tfrac{1}{2}\left[I + e^{-\Gamma t}(\sigma_1\cos\omega t + \sigma_2\sin\omega t)\right]. \tag{54}$$

If we now measure $\sigma_1$, we obtain the $+$ result with probability

$$P = {\rm tr}\left(\tfrac{1}{2}(I + \sigma_1)\rho(t)\right) = \tfrac{1}{2}\left(1 + e^{-\Gamma t}\cos\omega t\right). \tag{55}$$

Now suppose that the initial state is the Bell state $|\phi^+\rangle$ of two qubits, with density matrix

$$\rho_0 = \tfrac{1}{4}\left(I\otimes I + \sigma_3\otimes\sigma_3 + \sigma_1\otimes\sigma_1 - \sigma_2\otimes\sigma_2\right). \tag{56}$$

If both spins precess and depolarize independently, this state evolves to

$$\rho(t) = \tfrac{1}{4}\left[I\otimes I + e^{-2\Gamma t}\left(\sigma_3\otimes\sigma_3 + \cos 2\omega t\,(\sigma_1\otimes\sigma_1 - \sigma_2\otimes\sigma_2) + \sin 2\omega t\,(\sigma_1\otimes\sigma_2 + \sigma_2\otimes\sigma_1)\right)\right]; \tag{57}$$

if we measure the observable $\sigma_1\otimes\sigma_1$, we find the $+$ outcome with probability

$$P = {\rm tr}\left(\tfrac{1}{2}(I\otimes I + \sigma_1\otimes\sigma_1)\rho(t)\right) = \tfrac{1}{2}\left(1 + e^{-2\Gamma t}\cos 2\omega t\right). \tag{58}$$



Note that Eq. (58) has exactly the same functional form as Eq. (55), but with $t$ replaced by $2t$. Therefore, the entangled measurement performed in time $t/2$ collects exactly as much information about the frequency $\omega$ as the measurement of a single ion performed in time $t$.
If we have two qubits and total time t available, we can 
either perform the entangled measurement twice (taking 
time t/2 each time), or perform measurements on each 
qubit independently (taking time t). Either way, we ob- 
tain two outcomes and collect exactly the same amount 
of information on the average. 

More generally, suppose that we have $n$ qubits and a total time $T \gg 1/\Gamma$ available. We can use these qubits to perform altogether $nT/t$ independent single-qubit measurements, where each measurement requires time $t$. Plugging Eq. (55) into Eq. (47) and Eq. (48) (with $n$ replaced by $nT/t$), and choosing $\cos\omega t \sim 0$ to optimize the precision, we find that the frequency can be determined to accuracy

$$\Delta\omega = \frac{1}{\sqrt{nTt}}\,e^{\Gamma t}. \tag{59}$$

This precision is optimized if we choose $\Gamma t = 1/2$, where we obtain [28]

$$\Delta\omega = \sqrt{\frac{2e\Gamma}{nT}}. \tag{60}$$



On the other hand, we could repeat the experiment $T/t$ times using the $n$-qubit entangled state. Then we would obtain a precision

$$\Delta\omega = \frac{1}{\sqrt{nT\cdot nt}}\,e^{n\Gamma t}, \tag{61}$$



the same function as for uncorrelated qubits, but with t 
replaced by nt. Thus the optimal precision is the same 
in both cases, but is attained in the uncorrelated case by 
performing experiments that take n times longer than in 
the correlated case. 
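The equality of the two optima is easy to confirm numerically. The sketch below is a minimal illustration, assuming the precision formulas $e^{\Gamma t}/\sqrt{nTt}$ for uncorrelated qubits and $e^{n\Gamma t}/\sqrt{nT\cdot nt}$ for the $n$-qubit entangled state; the golden-section minimizer and all function names are hypothetical helpers, not part of the original analysis.

```python
import math


def precision_product(t, n, T, gamma):
    """Precision for uncorrelated qubits: exp(gamma t) / sqrt(n T t)."""
    return math.exp(gamma * t) / math.sqrt(n * T * t)


def precision_entangled(t, n, T, gamma):
    """Precision for the n-qubit cat state: exp(n gamma t) / sqrt(n T n t)."""
    return math.exp(n * gamma * t) / math.sqrt(n * T * n * t)


def golden_section_min(f, lo, hi, tol=1e-9):
    """Locate the minimum of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


if __name__ == "__main__":
    n, T, gamma = 10, 1000.0, 0.1  # illustrative values with T >> 1/gamma
    t_prod = golden_section_min(lambda t: precision_product(t, n, T, gamma), 1e-3, 50.0)
    t_ent = golden_section_min(lambda t: precision_entangled(t, n, T, gamma), 1e-3, 50.0)
    print("product:   t* =", t_prod, " precision =", precision_product(t_prod, n, T, gamma))
    print("entangled: t* =", t_ent, " precision =", precision_entangled(t_ent, n, T, gamma))
    print("sqrt(2 e gamma / (n T)) =", math.sqrt(2 * math.e * gamma / (n * T)))
```

With these inputs the optimal duration for the entangled strategy comes out $n$ times shorter than for the uncorrelated one, while both optimized precisions agree with $\sqrt{2e\Gamma/nT}$.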

That the entangled states offer no advantage in the determination of $\omega$ was one of the main conclusions of Huelga et al. [28]. A similar conclusion applies to estimating the difference in path length between two arms
of an interferometer using a specified optical power, if we 
take into account losses and optimize with respect to the 
number of times the light bounces inside the interferom- 
eter before it escapes and is detected. 

We would like to make the (rather obvious) point that this conclusion can change if we adopt a different model of decoherence, and in particular if the qubits do not decohere independently. As a simple example of correlated decoherence, consider the case of two qubits with $4\times 4$ density matrix $\rho$ evolving according to the master equation

$$\dot\rho = -i[H,\rho] - \Gamma\left(\rho - \tfrac{1}{4}I\right). \tag{62}$$



This master equation exhibits the analog, in the four- 
dimensional Hilbert space, of the uniform contraction of 
the Bloch sphere described by the depolarizing channel 
in the case of a qubit. Because the decoherence picks out 
no preferred direction in the Hilbert space (or any pre- 
ferred tensor-product decomposition) , we call this model 
"symmetric decoherence." 

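Under this master equation, with both qubits subjected to $H_\omega$ and to symmetric decoherence, the Bell state $\rho_0 = |\phi^+\rangle\langle\phi^+|$ evolves in time $t$ to the state

$$\rho(t) = \tfrac{1}{4}\left[I\otimes I + e^{-\Gamma t}\left(\sigma_3\otimes\sigma_3 + \cos 2\omega t\,(\sigma_1\otimes\sigma_1 - \sigma_2\otimes\sigma_2) + \sin 2\omega t\,(\sigma_1\otimes\sigma_2 + \sigma_2\otimes\sigma_1)\right)\right], \tag{63}$$

so that a measurement of $\sigma_1\otimes\sigma_1$ yields the $+$ outcome with probability

$$P = \tfrac{1}{2}\left(1 + e^{-\Gamma t}\cos 2\omega t\right). \tag{64}$$

On the other hand, the initial product state

$$\rho_0 = \tfrac{1}{4}(I + \sigma_1)\otimes(I + \sigma_1) \tag{65}$$

becomes entangled as a result of symmetric decoherence. Were the Hamiltonian trivial, it would evolve to

$$\rho(t) = \tfrac{1}{4}\,I\otimes I + \tfrac{1}{4}\,e^{-\Gamma t}\left(\sigma_1\otimes I + I\otimes\sigma_1 + \sigma_1\otimes\sigma_1\right). \tag{66}$$

Including the precession

$$\sigma_1 \to \sigma_1\cos\omega t + \sigma_2\sin\omega t, \tag{67}$$

we obtain

$$\rho(t) = \tfrac{1}{4}\,I\otimes I + \tfrac{1}{4}\,e^{-\Gamma t}\left(\sigma_1\otimes I\cos\omega t + \cdots\right), \tag{68}$$

so that measurement of the single-qubit observable $\sigma_1\otimes I$ yields the $+$ outcome with probability

$$P = {\rm tr}\left(\tfrac{1}{2}(I\otimes I + \sigma_1\otimes I)\rho(t)\right) = \tfrac{1}{2}\left(1 + e^{-\Gamma t}\cos\omega t\right). \tag{69}$$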



Comparing Eq. (69) and Eq. (64), the important thing to notice is that with symmetric decoherence, entangled states decay no faster than product states; therefore, we can enjoy the benefit of entanglement (faster precession) without paying the price (faster decay).
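The comparison can be reproduced with a small density-matrix calculation. The sketch below assumes the symmetric-decoherence master equation $\dot\rho = -i[H,\rho] - \Gamma(\rho - I/4)$ with $H = \tfrac{\omega}{2}(\sigma_3\otimes I + I\otimes\sigma_3)$, whose solution is $\rho(t) = e^{-\Gamma t}\,U\rho_0U^\dagger + (1 - e^{-\Gamma t})\,I/4$ with $U = e^{-iHt}$. The basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$, the bit-mask encoding of the observables, and the function names are illustrative choices.

```python
import cmath
import math


def evolve_symmetric(rho0, omega, gamma, t):
    """rho(t) = e^{-gamma t} U rho0 U^dagger + (1 - e^{-gamma t}) I/4.

    H is diagonal in the basis |00>, |01>, |10>, |11> with eigenvalues
    omega, 0, 0, -omega, so U is a diagonal matrix of phases.
    """
    energies = [omega, 0.0, 0.0, -omega]
    phase = [cmath.exp(-1j * e * t) for e in energies]
    decay = math.exp(-gamma * t)
    rho = [[decay * phase[i] * rho0[i][j] * phase[j].conjugate()
            for j in range(4)] for i in range(4)]
    for i in range(4):
        rho[i][i] += (1 - decay) / 4
    return rho


def prob_plus(rho, flip_mask):
    """Probability of the + outcome for a product of sigma_1 operators.

    flip_mask = 3 encodes sigma_1 (x) sigma_1 and flip_mask = 2 encodes
    sigma_1 (x) I; the operator maps basis state i to i XOR flip_mask.
    """
    expectation = sum(rho[i ^ flip_mask][i] for i in range(4)).real
    return 0.5 * (1 + expectation)


# Bell state (|00> + |11>)/sqrt(2) and product state |+>|+>.
BELL = [[0.5 if i in (0, 3) and j in (0, 3) else 0.0 for j in range(4)]
        for i in range(4)]
PRODUCT = [[0.25 for _ in range(4)] for _ in range(4)]

if __name__ == "__main__":
    omega, gamma, t = 1.3, 0.2, 0.7  # illustrative values
    print("entangled:", prob_plus(evolve_symmetric(BELL, omega, gamma, t), 3))
    print("product:  ", prob_plus(evolve_symmetric(PRODUCT, omega, gamma, t), 2))
```

Both signals carry the same damping factor $e^{-\Gamma t}$, but the entangled one oscillates at $2\omega$ rather than $\omega$.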

To establish more firmly that entangled strategies outperform nonentangled strategies in the symmetric decoherence model, we should consider more closely what are the optimal final measurements for these two types of initial states. To give the problem a precise information-theoretic formulation, we return to the problem of distinguishing two cases, the trivial Hamiltonian and $H_\omega$, which are assumed to be equiprobable. For either the product initial state or the entangled initial state, we evolve for time $t$, and then perform the best measurement that distinguishes between evolution governed by $H_\omega$ and trivial evolution. In both cases, the measurement is permitted to be an entangled measurement; that is, we optimize with respect to all POVM's in the four-dimensional Hilbert space.

In either case (initial product state or initial entangled state), we can find the two-outcome POVM that identifies the Hamiltonian with minimal probability of error. When there is no decoherence, this POVM (when restricted to the two-dimensional subspace containing the two pure states to be distinguished) is the familiar orthogonal measurement that best distinguishes two pure states of a qubit. In fact, for symmetric decoherence, this same measurement minimizes the error probability for any value of the damping rate $\Gamma$. It is thus the two-outcome measurement with the maximal information gain (the measurement outcome has maximal mutual information with the choice of the Hamiltonian). Although we don't have a proof, we can make a reasonable guess that, for symmetric decoherence, this two-outcome measurement has the maximal information gain of any measurement, including POVM's with more outcomes.

If either initial state evolves for time $t$, and then this optimal POVM is performed, the error probability can be expressed as

$$P_{\rm error} = \tfrac{1}{2} - \tfrac{1}{2}\,e^{-\Gamma t}\left|\sin\theta(t)\right|; \tag{70}$$

here $\theta(t)$ is the angle between the states; that is, $\cos\theta(t)$ is the inner product of the evolving and static states, in the limit of no damping ($\Gamma = 0$). For the entangled initial state, we have

$$\cos\theta_{\rm entangled} = \cos\omega t, \tag{71}$$

and for the product initial state, we have

$$\cos\theta_{\rm product} = \cos^2\frac{\omega t}{2}. \tag{72}$$

Since

$$|\cos\theta_{\rm entangled}| = |\cos\omega t| < \tfrac{1}{2}(1 + \cos\omega t) = |\cos\theta_{\rm product}| \tag{73}$$

for $\cos\theta_{\rm entangled} > 0$, the error probability achieved by the entangled initial state is smaller than that achieved by the product state for $0 < \omega t < \pi/2$, which is sufficient to ensure that the error probability optimized with respect to $t$ is always smaller in the entangled case for any nonzero value of $\Gamma$. Similarly, if we optimize the information gain with respect to $t$, the entangled strategy has the higher information gain for all $\Gamma > 0$. The improvement in information gain (in bits) achieved using an entangled initial state rather than a product initial state is plotted in Fig. 1 as a function of $\Gamma/\omega$. The maximum improvement of about .136 bits occurs for $\Gamma/\omega \sim .379$.




FIG. 1. Improvement in information gain (in bits) achieved by using an entangled initial state, as a function of the ratio of decoherence rate $\Gamma$ to precession frequency $\omega$.
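The ordering of the optimized error probabilities can be illustrated numerically. The following is a minimal sketch under stated assumptions: it uses the error probability $\tfrac12 - \tfrac12 e^{-\Gamma t}|\sin\theta(t)|$ with $\cos\theta = \cos\omega t$ for the entangled initial state and $\cos\theta = \cos^2(\omega t/2)$ for the product initial state, and minimizes over $t$ by a plain grid search on one period. It compares error probabilities only and does not attempt to reproduce the information-gain curve of Fig. 1; the function names are hypothetical.

```python
import math


def error_probability(cos_theta, gamma, t):
    """Error probability 1/2 - 1/2 exp(-gamma t) |sin(theta)|."""
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta ** 2))
    return 0.5 - 0.5 * math.exp(-gamma * t) * sin_theta


def error_entangled(omega, gamma, t):
    """Entangled initial state: cos(theta) = cos(omega t)."""
    return error_probability(math.cos(omega * t), gamma, t)


def error_product(omega, gamma, t):
    """Product initial state: cos(theta) = cos^2(omega t / 2)."""
    return error_probability(math.cos(0.5 * omega * t) ** 2, gamma, t)


def minimal_error(error_fn, omega, gamma, samples=20000):
    """Grid-search minimum of the error probability over 0 < t <= 2 pi / omega."""
    t_max = 2 * math.pi / omega
    return min(error_fn(omega, gamma, k * t_max / samples)
               for k in range(1, samples + 1))


if __name__ == "__main__":
    omega = 1.0
    for ratio in (0.05, 0.379, 2.0):  # illustrative values of gamma / omega
        gamma = ratio * omega
        print(f"gamma/omega={ratio:5.3f}  "
              f"entangled={minimal_error(error_entangled, omega, gamma):.4f}  "
              f"product={minimal_error(error_product, omega, gamma):.4f}")
```

For each ratio tried, the entangled initial state gives the smaller optimized error probability, consistent with the argument based on Eq. (73).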

We have already seen in §II that, even in the absence of decoherence, an entangled strategy may outperform an unentangled strategy if we are trying to distinguish more than two alternatives. This advantage will persist when sufficiently weak decoherence is included, whether correlated or uncorrelated. In that event, since only one member of an entangled pair is exposed to the unknown Hamiltonian, we may be able to shelter the other member of the pair from the ravages of the environment, slowing the decay of the state and strengthening the signal.



XI. CONCLUSIONS 

We feel that quantum information theory, having al- 
ready secured a central position at the foundations of 
computer science, will eventually erect bridges connect- 
ing with many subfields of physics. The results reported 
here (and other related examples) give strong hints that 
ideas emerging from the theory of quantum information 
and computation are destined to profoundly influence the 
experimental physics techniques of the future. 

We have only scratched the surface of this important 
subject. Among the many issues that deserve further 
elaboration are the connections between superoperator 
distinguishability and superoperator norms, the efficacy
of the quantum Fourier transform in the presence of de- 
coherence, the measurement of continuous quantum vari- 
ables, the applications of quantum error correction, and 
the detection of time-dependent signals. 
APPENDIX A: FIXED-TIME-DRIVING
THEOREM

In this appendix, we sketch the proof of the theorem stated in §V.

For a unitary $N\times N$ matrix $U$, we define ${\rm maxarg}(U)$ to be the largest argument of an eigenvalue of $U$, where the argument takes values in the interval $(-\pi, \pi]$. Similarly, ${\rm minarg}(U)$ is the minimum argument of an eigenvalue of $U$. Our theorem is:

Theorem 1. If $H$ and $K$ are finite-dimensional Hermitian matrices, and $\|H\|_{\rm sup} < \pi$, then

$${\rm maxarg}\left(e^{iK}e^{-i(H+K)}\right) \le {\rm maxarg}\left(e^{-iH}\right), \tag{74}$$

$${\rm minarg}\left(e^{iK}e^{-i(H+K)}\right) \ge {\rm minarg}\left(e^{-iH}\right). \tag{75}$$

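To prove the theorem, we begin with:

Lemma 2. For unitary $U$ with ${\rm maxarg}(U) \ne \pi$, and Hermitian $A$,

$${\rm maxarg}\left(Ue^{i\varepsilon A}\right) \le {\rm maxarg}(U) + {\rm maxarg}\left(e^{i\varepsilon A}\right) + O(\varepsilon^2), \tag{76}$$

$${\rm minarg}\left(Ue^{i\varepsilon A}\right) \ge {\rm minarg}(U) + {\rm minarg}\left(e^{i\varepsilon A}\right) - O(\varepsilon^2). \tag{77}$$

Proof: Write $U = e^{iB}$, where $B$ is Hermitian and $\|B\|_{\rm sup} < \pi$; then ${\rm maxarg}(e^{iB}) = \max(B)$, where $\max(B)$ denotes the maximum eigenvalue of $B$. From the Baker-Campbell-Hausdorff formula, we have

$$e^{iB}e^{i\varepsilon A} = \exp\left[i\left(B + \varepsilon A + \tfrac{1}{2}\varepsilon[C,B] + O(\varepsilon^2)\right)\right], \tag{78}$$

where $C$ is linear in $A$. Then lowest-order eigenvalue perturbation theory tells us that

$$\begin{aligned}
\max\left(B + \varepsilon A + \tfrac{1}{2}\varepsilon[C,B]\right) &= \max(B) + \langle\psi|\left(\varepsilon A + \tfrac{1}{2}\varepsilon[C,B]\right)|\psi\rangle + O(\varepsilon^2) \\
&= \max(B) + \langle\psi|(\varepsilon A)|\psi\rangle + O(\varepsilon^2) \\
&\le \max(B) + \max(\varepsilon A) + O(\varepsilon^2)
\end{aligned} \tag{79}$$

(where $|\psi\rangle$ is in the eigenspace of $B$ with maximal eigenvalue). This proves Eq. (76). Eq. (77) is proved similarly. Note that the condition ${\rm maxarg}(U) \ne \pi$ is necessary so that the singularity of the maxarg function can be avoided for $\varepsilon$ sufficiently small.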

Lemma 2 is all we will need for the proof of Theorem
1. But it is useful to note that Lemma 2 may be invoked 
to prove: 

Lemma 3.† For unitary $U_1$ and $U_2$, such that

$${\rm maxarg}(U_1) + {\rm maxarg}(U_2) < \pi, \tag{80}$$

$${\rm minarg}(U_1) + {\rm minarg}(U_2) > -\pi, \tag{81}$$

we have

$${\rm maxarg}(U_1U_2) \le {\rm maxarg}(U_1) + {\rm maxarg}(U_2), \tag{82}$$

$${\rm minarg}(U_1U_2) \ge {\rm minarg}(U_1) + {\rm minarg}(U_2). \tag{83}$$

Proof: We write

$$U_1U_2 = U_1e^{iA} = U_1\left(e^{iA/n}\right)^n, \tag{84}$$

where the eigenvalues of $A$ lie in the interval $(-\pi, \pi)$, and apply Lemma 2 repeatedly, obtaining

$${\rm maxarg}\left(U_1e^{iA}\right) \le {\rm maxarg}(U_1) + n\left[{\rm maxarg}\left(e^{iA/n}\right) + O(n^{-2})\right]. \tag{85}$$

Taking the $n \to \infty$ limit proves Eq. (82). Eq. (83) is proved similarly. Note that because of the conditions Eq. (80) and Eq. (81), Lemma 2 can be safely applied $n$ times in succession; the accumulated maxarg and minarg of the product never approach $\pi$.

To complete the proof of Theorem 1, we invoke the Lie product formula

$$\lim_{n\to\infty}\left(e^{-iH/n}e^{-iK/n}\right)^n = e^{-i(H+K)} \tag{86}$$

to write

$$e^{iK}e^{-i(H+K)} = \lim_{n\to\infty} e^{iK/n}\cdots e^{iK/n}\,e^{-iH/n}e^{-iK/n}\cdots e^{-iH/n}e^{-iK/n}. \tag{87}$$

† Strangely, we could find only one reference to this proposition in the literature; it is a special case of Eq. (8) in [29].



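Since $e^{iK/n}e^{-iH/n}e^{-iK/n}$ and $e^{-iH/n}$ have the same eigenvalues, Lemma 3 implies that

$${\rm maxarg}\left(e^{iK/n}e^{-iH/n}e^{-iK/n}e^{-iH/n}\right) \le 2\cdot{\rm maxarg}\left(e^{-iH/n}\right). \tag{88}$$

Similarly, we have

$${\rm maxarg}\left(e^{iK/n}\left(e^{iK/n}e^{-iH/n}e^{-iK/n}e^{-iH/n}\right)e^{-iK/n}e^{-iH/n}\right) \le 3\cdot{\rm maxarg}\left(e^{-iH/n}\right), \tag{89}$$

and so on. Hence, applying Lemma 3 altogether $n$ times to the right-hand side of Eq. (87), we find that

$${\rm maxarg}\left[\left(e^{iK/n}\right)^n\left(e^{-iH/n}e^{-iK/n}\right)^n\right] \le n\cdot{\rm maxarg}\left(e^{-iH/n}\right) \tag{90}$$

$$= {\rm maxarg}\left(e^{-iH}\right). \tag{91}$$

Taking the $n \to \infty$ limit completes the proof of Eq. (74). Eq. (75) is proved similarly.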

The upper bound on $\|H\|_{\rm sup}$ is a key feature of the formulation of Theorem 1. This bound ensures that the conditions Eq. (80) and Eq. (81) are satisfied each time that Lemma 3 is invoked in the proof. If $\|H\|_{\rm sup}$ is too large, then counterexamples can be constructed.

In any event, for the discussion in §V, we are interested in the case where the maximal and minimal eigenvalues of $H$ differ by less than $\pi$, and by shifting $H$ by a constant we can ensure that $\|H\|_{\rm sup} < \pi/2$. Therefore, the
theorem enforces the conclusion that if we are to distin- 
guish a nontrivial Hamiltonian from the trivial Hamilto- 
nian in an experiment conducted in a fixed elapsed time, 
turning on a nonzero time-independent "driving term" 
K provides no advantage. 
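The maxarg inequality of Theorem 1 can be spot-checked numerically in the simplest setting. The sketch below is an illustration only, restricted (as an assumption) to traceless $2\times 2$ Hermitian matrices $H = \vec h\cdot\vec\sigma$ and $K = \vec k\cdot\vec\sigma$ with $|\vec h| < \pi$. In that case every matrix involved lies in SU(2), its eigenvalues are $e^{\pm i\phi}$ with $\cos\phi$ equal to half the trace, and maxarg is simply $\phi$. All function names are hypothetical helpers.

```python
import math
import random


def exp_minus_i(v):
    """Return exp(-i v.sigma) = cos|v| I - i (sin|v| / |v|) v.sigma as a 2x2 matrix."""
    r = math.sqrt(sum(x * x for x in v))
    if r == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    c, s = math.cos(r), math.sin(r) / r
    x, y, z = v
    return [[c - 1j * s * z, -1j * s * (x - 1j * y)],
            [-1j * s * (x + 1j * y), c + 1j * s * z]]


def matmul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def maxarg_su2(u):
    """Largest eigenvalue argument of an SU(2) matrix, via cos(phi) = Re tr(u) / 2."""
    half_trace = (u[0][0] + u[1][1]).real / 2
    return math.acos(max(-1.0, min(1.0, half_trace)))


def theorem_sides(h, k):
    """Return (maxarg(e^{iK} e^{-i(H+K)}), maxarg(e^{-iH})) for H = h.sigma, K = k.sigma."""
    exp_plus_ik = exp_minus_i([-x for x in k])
    exp_minus_ihk = exp_minus_i([a + b for a, b in zip(h, k)])
    lhs = maxarg_su2(matmul(exp_plus_ik, exp_minus_ihk))
    rhs = maxarg_su2(exp_minus_i(h))
    return lhs, rhs


def random_vector(radius):
    """Random 3-vector of the given length with an isotropic direction."""
    g = [random.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(x * x for x in g))
    return [radius * x / norm for x in g]


if __name__ == "__main__":
    random.seed(0)
    worst = max(lhs - rhs for lhs, rhs in
                (theorem_sides(random_vector(random.uniform(0.1, 3.0)),
                               random_vector(random.uniform(0.0, 10.0)))
                 for _ in range(1000)))
    print("largest observed lhs - rhs:", worst)
```

In this restricted setting the left-hand side never exceeds the right-hand side in the sampled cases, and equality is reached when $H$ and $K$ commute, which is the sense in which the driving term $K$ provides no advantage.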